Project-Team:ASAP

Inria | Raweb 2014 | Presentation of the Project-Team ASAP | ASAP Web Site



Section: New Results

Models and abstractions for distributed systems

Signature-free asynchronous Byzantine consensus

Participant : Michel Raynal.

In [34] we present a new round-based asynchronous consensus algorithm that copes with up to $t < n / 3$ Byzantine processes, where $n$ is the total number of processes. In addition to using no signatures and assuming no computationally limited adversary, while remaining optimal with respect to the value of $t$, this algorithm has several noteworthy properties: the expected number of rounds to decide is four, each round is composed of two or three communication steps and involves $O(n^{2})$ messages, and each message is composed of a round number plus a single bit. To attain this goal, the consensus algorithm relies on a common coin as defined by Rabin, and on a new, extremely simple and powerful broadcast abstraction suited to binary values. The main aim in designing this algorithm was to obtain a cheap and simple algorithm, motivated by the fact that, among first-class properties, simplicity (albeit sometimes underestimated or even ignored) is a major one.
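As an illustration only, the following Python sketch simulates one way a broadcast for binary values can filter out values proposed solely by Byzantine processes. The relay threshold $t + 1$, the acceptance threshold $2t + 1$, the name `bv_broadcast`, and the choice to model Byzantine processes as silent are assumptions of this sketch; none of them is stated in the summary above, and the sketch covers neither the common coin nor the consensus rounds.

```python
from collections import deque


def bv_broadcast(inputs, t):
    """Simulate an echo-threshold broadcast of bits among correct processes.

    inputs: dict mapping each correct process id to its input bit.
    t: number of Byzantine processes (assumed silent in this sketch).
    Returns, for each correct process, the set of bits it accepts.
    """
    n = len(inputs) + t
    assert n > 3 * t, "the sketch assumes t < n / 3"

    sent = {p: set() for p in inputs}                      # bits already broadcast
    seen = {p: {0: set(), 1: set()} for p in inputs}       # senders seen per bit
    accepted = {p: set() for p in inputs}
    queue = deque()                                        # in-flight messages

    def broadcast(p, bit):
        # Each process broadcasts a given bit at most once.
        if bit not in sent[p]:
            sent[p].add(bit)
            for q in inputs:
                queue.append((p, q, bit))

    for p, bit in inputs.items():
        broadcast(p, bit)

    while queue:
        src, dst, bit = queue.popleft()
        seen[dst][bit].add(src)
        # Assumed rule 1: relay a bit vouched for by at least one correct process.
        if len(seen[dst][bit]) >= t + 1:
            broadcast(dst, bit)
        # Assumed rule 2: accept a bit supported by a large enough quorum.
        if len(seen[dst][bit]) >= 2 * t + 1:
            accepted[dst].add(bit)
    return accepted
```

With four processes and $t = 1$, a bit proposed by only one correct process never reaches the relay threshold in this simulation, whereas a bit proposed by two correct processes is eventually accepted by all of them.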

This is joint work with Achour Mostéfaoui and Hamouma Moumen. It received the PODC 2014 Best Paper Award.

Randomized mutual exclusion with constant amortized RMR complexity on the DSM

Participant : George Giakkoupis.

In [30] we settle an open question by determining the remote memory reference (RMR) complexity of randomized mutual exclusion, on the distributed shared memory model (DSM) with atomic registers, in a weak but natural (and stronger than oblivious) adversary model. In particular, we present a mutual exclusion algorithm that has constant expected amortized RMR complexity and is deterministically deadlock free. Prior to this work, no randomized algorithm with $o(\log n / \log \log n)$ RMR complexity was known for the DSM model. Our algorithm is fairly simple, and compares favorably with one by Bender and Gilbert (FOCS 2011) for the CC model, which has expected amortized RMR complexity $O(\log^{2} \log n)$ and provides only probabilistic deadlock freedom.
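To make the cost measure concrete, the following Python sketch models how remote memory references are counted in a DSM-style memory: each register lives in the segment of one process, and only accesses to registers owned by another process are charged. The class `DsmMemory` and its method names are hypothetical and serve only this illustration; the sketch is not the mutual exclusion algorithm of [30].

```python
from collections import Counter


class DsmMemory:
    """Registers partitioned into per-process segments, with RMR accounting."""

    def __init__(self):
        self.owner = {}        # register name -> owning process id
        self.value = {}        # register name -> current value
        self.rmrs = Counter()  # process id -> number of remote accesses

    def allocate(self, name, owner, initial=0):
        self.owner[name] = owner
        self.value[name] = initial

    def _charge(self, pid, name):
        # An access is remote only if the register is in another segment.
        if self.owner[name] != pid:
            self.rmrs[pid] += 1

    def read(self, pid, name):
        self._charge(pid, name)
        return self.value[name]

    def write(self, pid, name, new_value):
        self._charge(pid, name)
        self.value[name] = new_value


mem = DsmMemory()
mem.allocate("flag_p0", owner=0)

# Process 0 spins on a register in its own segment: no RMR is charged.
for _ in range(1000):
    mem.read(0, "flag_p0")

# Process 1 releases process 0 with a single remote write: one RMR.
mem.write(1, "flag_p0", 1)
```

Local spinning is free under this measure, which is why algorithms for the DSM model arrange for each waiting process to spin on a register in its own segment.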

This is joint work with Philipp Woelfel (Univ. of Calgary, Canada).

Reliable shared memory abstraction on top of asynchronous Byzantine message-passing systems

Participants : Michel Raynal, Julien Stainer.

This work addresses the construction and use of a shared memory abstraction on top of an asynchronous message-passing system in which up to $t$ processes may commit Byzantine failures. This abstraction consists of arrays of $n$ single-writer/multi-reader atomic registers, where $n$ is the number of processes. Unlike usual atomic registers, which record a single value, each of these atomic registers records the whole history of values written to it. A distributed algorithm building such a shared memory abstraction is first presented. This algorithm assumes $t < n / 3$, which is shown to be a necessary and sufficient condition for such a construction. Hence, the algorithm is resilience-optimal. Then we present distributed algorithms built on top of this shared memory abstraction, which cope with up to $t$ Byzantine processes. The simplicity of these algorithms constitutes a strong motivation for such a shared memory abstraction in the presence of Byzantine processes. For many problems, algorithms are more difficult to design and prove correct in a message-passing system than in a shared memory system. Using a protocol stacking methodology, the aim of the proposed abstraction is to allow an easier design (and proof) of distributed algorithms when the underlying system is an asynchronous message-passing system prone to Byzantine failures.
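The interface offered to the upper layer can be pictured with the following Python sketch of a single-writer/multi-reader register that keeps its whole write history, together with the largest $t$ allowed by $t < n / 3$. It models only the sequential interface in a single address space. The names `HistoryRegister`, `make_memory` and `max_byzantine` are hypothetical, and the Byzantine-tolerant message-passing implementation is not shown.

```python
class HistoryRegister:
    """Single-writer/multi-reader register recording every value written."""

    def __init__(self, writer):
        self.writer = writer
        self._history = []

    def write(self, pid, value):
        # Only the designated writer may append to the history.
        if pid != self.writer:
            raise PermissionError(f"process {pid} is not the writer")
        self._history.append(value)

    def read(self):
        # Readers obtain the full history, not just the last value.
        return tuple(self._history)


def make_memory(n):
    """One array of n registers; register i is written only by process i."""
    return [HistoryRegister(writer=i) for i in range(n)]


def max_byzantine(n):
    """Largest integer t satisfying t < n / 3."""
    return (n - 1) // 3
```

For example, after process 2 writes "a" and then "b" to its own register, any reader obtains ("a", "b"), and a system of $n = 4$ processes tolerates one Byzantine process.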

This work was done in collaboration with Damien Imbs and Sergio Rajsbaum. It has been published in SIROCCO [32] and as a technical report [43].

Distributed Universality

Participants : Michel Raynal, Julien Stainer.

The notion of a universal construction suited to distributed computing was introduced by M. Herlihy in his celebrated paper “Wait-free synchronization” (ACM TOPLAS, 1991). A universal construction is an algorithm that can be used to wait-free implement any object defined by a sequential specification. Herlihy’s paper shows that the basic system model, which supports only atomic read/write registers, has to be enriched with consensus objects to allow the design of universal constructions. The generalized notion of a $k$-universal construction was recently introduced by Gafni and Guerraoui (CONCUR 2011). A $k$-universal construction is an algorithm that can be used to simultaneously implement $k$ objects (instead of just one object), with the guarantee that at least one of the $k$ constructed objects progresses forever. While Herlihy’s universal construction relies on atomic registers and consensus objects, a $k$-universal construction relies on atomic registers and $k$-simultaneous consensus objects (which are wait-free equivalent to $k$-set agreement objects in the read/write system model). This work significantly extends the universality results introduced by Herlihy and Gafni-Guerraoui.
In particular, we present a $k$-universal construction which satisfies the following five desired properties, which are not satisfied by the previous $k$-universal construction: (1) among the $k$ objects that are constructed, at least $ℓ$ objects (and not just one) are guaranteed to progress forever; (2) the progress condition for processes is wait-freedom, which means that each correct process executes an infinite number of operations on each object that progresses forever; (3) if any of the $k$ constructed objects stops progressing, all its copies (one at each process) stop in the same state; (4) the proposed construction is contention-aware, in the sense that it uses only read/write registers in the absence of contention; and (5) it is generous with respect to the obstruction-freedom progress condition, which means that each process is able to complete any one of its pending operations on the $k$ objects if all the other processes hold still long enough. The proposed construction, which is based on new design principles, is called a ($k$, $ℓ$)-universal construction. It uses a natural extension of $k$-simultaneous consensus objects, called ($k$, $ℓ$)-simultaneous consensus objects (($k$, $ℓ$)-SC). Together with atomic registers, ($k$, $ℓ$)-SC objects are shown to be necessary and sufficient for building a ($k$, $ℓ$)-universal construction, and, in that sense, ($k$, $ℓ$)-SC objects are ($k$, $ℓ$)-universal.
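For readers unfamiliar with universal constructions, the following Python sketch shows the basic idea for a single object: a sequence of consensus objects fixes a common order of operations, and each process replays that order on its local copy. It is a simplified, sequentially simulated illustration in the spirit of Herlihy's construction, without the helping mechanism needed for wait-freedom, and it is not the ($k$, $ℓ$)-universal construction described above. All class and function names are hypothetical.

```python
class Consensus:
    """One-shot consensus object: the first proposed value is decided."""

    def __init__(self):
        self._decided = False
        self._value = None

    def propose(self, value):
        if not self._decided:
            self._decided = True
            self._value = value
        return self._value


class SharedLog:
    """Unbounded sequence of consensus objects, created on demand."""

    def __init__(self):
        self._slots = []

    def slot(self, index):
        while len(self._slots) <= index:
            self._slots.append(Consensus())
        return self._slots[index]


class Replica:
    """Local copy of the object, kept consistent by replaying the log."""

    def __init__(self, pid, log, initial_state, apply_op):
        self.pid = pid
        self.log = log
        self.state = initial_state
        self.apply_op = apply_op   # (state, op) -> (new_state, result)
        self.next_slot = 0
        self.seq = 0

    def invoke(self, op):
        self.seq += 1
        tagged = (self.pid, self.seq, op)   # makes each invocation unique
        while True:
            decided = self.log.slot(self.next_slot).propose(tagged)
            self.next_slot += 1
            self.state, result = self.apply_op(self.state, decided[2])
            if decided == tagged:           # our operation has been ordered
                return result


def counter_apply(state, op):
    """Sequential specification of a counter supporting only 'inc'."""
    if op != "inc":
        raise ValueError(f"unknown operation {op!r}")
    return state + 1, state + 1
```

Two replicas sharing one `SharedLog` and invoking `"inc"` in turn obtain distinct, increasing results, because every replica applies the decided operations in the same order.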

This work was done in collaboration with Gadi Taubenfeld. It has been published as a brief announcement in PODC [37], and the full version appeared in OPODIS [38]. A version has also been published as a technical report [45].

Computing in the presence of concurrent solo executions

Participants : Michel Raynal, Julien Stainer.

In a wait-free model any number of processes may crash. A process runs solo when it computes its local output without receiving any information from other processes, either because they crashed or because they are too slow. While in wait-free shared-memory models at most one process may run solo in an execution, any number of processes may have to run solo in an asynchronous wait-free message-passing model. This work is on the computability power of models in which several processes may concurrently run solo. It first introduces a family of round-based wait-free models, called the $d$-solo models, $1 \leq d \leq n$, where up to $d$ processes may run solo. We then give a characterization of the colorless tasks that can be solved in each $d$-solo model. We also introduce the ($d$, $ϵ$)-solo approximate agreement task, which generalizes $ϵ$-approximate agreement, and prove that ($d$, $ϵ$)-solo approximate agreement can be solved in the $d$-solo model, but cannot be solved in the $(d + 1)$-solo model. We also study the relation linking $d$-set agreement and ($d$, $ϵ$)-solo approximate agreement in asynchronous wait-free message-passing systems. These results establish for the first time a hierarchy of wait-free models that, while weaker than the basic read/write model, are nevertheless strong enough to solve non-trivial tasks.
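As a point of reference, the following Python sketch checks the two conditions of the classical $ϵ$-approximate agreement task that the ($d$, $ϵ$)-solo variant generalizes: every output lies within the range of the inputs, and any two outputs differ by at most $ϵ$. The function name is hypothetical, and the additional requirements of the ($d$, $ϵ$)-solo task are not modeled here.

```python
def satisfies_approximate_agreement(inputs, outputs, eps):
    """Check validity and eps-agreement for one execution.

    inputs:  values proposed by the processes.
    outputs: values decided by the processes that decide.
    eps:     maximal allowed distance between any two outputs.
    """
    if not outputs:
        return True  # nothing was decided, so nothing can be violated
    low, high = min(inputs), max(inputs)
    valid = all(low <= y <= high for y in outputs)
    close = max(outputs) - min(outputs) <= eps
    return valid and close
```

With inputs 0.0 and 1.0, the outputs 0.5, 0.5 and 0.75 are acceptable for $ϵ = 0.25$, whereas the outputs 0.0 and 1.0 are not acceptable for $ϵ = 0.5$.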

This work was done in collaboration with Maurice Herlihy and Sergio Rajsbaum. It has been published in LATIN [31].

A simple broadcast algorithm for recurrent dynamic systems

Participants : Michel Raynal, Julien Stainer.

This work presents a simple broadcast algorithm suited to dynamic systems where links can repeatedly appear and disappear. The algorithm is proved correct, and a simple improvement is introduced that reduces the number and the size of control messages. As it extends a classical network traversal algorithm to the dynamic context in a simple way, the proposed algorithm also has a pedagogical flavor.
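The effect of links that appear and disappear can be illustrated with the following Python sketch, which floods a message over a round-by-round schedule of available links. It is a plain flooding simulation under assumed synchronous rounds, with a hypothetical function name, and not the algorithm of [36] or its control-message optimization.

```python
def flood(source, schedule):
    """Return the set of nodes reached by flooding from `source`.

    schedule: list of rounds; each round is a collection of undirected
    links (u, v) that exist during that round. A message crosses at most
    one link per round.
    """
    informed = {source}
    for links in schedule:
        newly_informed = set()
        for u, v in links:
            if u in informed and v not in informed:
                newly_informed.add(v)
            elif v in informed and u not in informed:
                newly_informed.add(u)
        # Nodes informed in this round start forwarding in the next one.
        informed |= newly_informed
    return informed
```

If link (2, 3) is present only before node 2 has been informed, node 3 is never reached. If the link appears again later, the broadcast completes, which is the kind of recurrence such systems rely on.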

This work was done in collaboration with Jiannong Cao and Weigang Wu. It has been published in AINA [36].

Fisheye consistency: Keeping data in synch in a georeplicated world

Participants : Michel Raynal, François Taïani.

Over the last thirty years, numerous consistency conditions for replicated data have been proposed and implemented. Popular examples of such conditions include linearizability (or atomicity), sequential consistency, causal consistency, and eventual consistency. These consistency conditions are usually defined independently from the computing entities (nodes) that manipulate the replicated data; i.e., they do not take into account how computing entities might be linked to one another, or geographically distributed. To address this gap, as a first contribution, this work [41] introduces the notion of a proximity graph between computing nodes. If two nodes are connected in this graph, their operations must satisfy a strong consistency condition, while the operations invoked by other nodes are allowed to satisfy a weaker condition. The second contribution is the use of such a graph to provide a generic approach to the hybridization of data consistency conditions within the same system. We illustrate this approach on sequential consistency and causal consistency, and present a model in which all data operations are causally consistent, while operations by neighboring processes in the proximity graph are sequentially consistent. The third contribution of this work is the design and the proof of a distributed algorithm based on this proximity graph, which combines sequential consistency and causal consistency (the resulting condition is called fisheye consistency). In doing so, this work not only extends the domain of consistency conditions, but also provides a generic, provably correct solution of direct relevance to modern georeplicated systems.
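The role of the proximity graph can be sketched in a few lines of Python: given the graph, the function below returns which of the two conditions is required between the operations of two nodes. The function name, the string labels, and the choice to treat a node as its own neighbor are assumptions of this illustration; the distributed algorithm that enforces the conditions is not shown.

```python
def required_consistency(p, q, proximity_edges):
    """Consistency required between the operations of nodes p and q.

    proximity_edges: set of frozenset({u, v}) pairs, one per undirected
    edge of the proximity graph.
    """
    # Assumption: the operations of a single node are ordered like those
    # of neighbors.
    if p == q or frozenset((p, q)) in proximity_edges:
        return "sequential"
    # All operations are at least causally consistent.
    return "causal"


# Hypothetical deployment: two nearby sites are neighbors, a third is remote.
edges = {frozenset(("paris", "rennes"))}
```

Here the operations of "paris" and "rennes" must be sequentially consistent with each other, while operations involving a remote third node only need to be causally consistent with theirs.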

This work was done in collaboration with Roy Friedman (The Technion, Haifa, Israel).
